sliced wasserstein generative model
Supplement to Amortized Projection Optimization for Sliced Wasserstein Generative Models
PRW can be seen as a generalization of Max-SW, since PRW with k = 1 is equivalent to Max-SW. As with Max-SW, the optimization problem in PRW is solved by projected gradient ascent. The details of the algorithm are given in Algorithm 4. We recall that other optimization methods have also been used to solve PRW, such as Riemannian optimization [28] and block coordinate descent [21]. However, in this paper, we consider the original and simplest method, which is projected gradient ascent.
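As an illustration of projected gradient ascent in the k = 1 (Max-SW) case, the following is a minimal NumPy sketch, not the paper's Algorithm 4. It assumes two mini-batches of equal size with uniform weights, uses the squared 2-Wasserstein cost on the projections, and treats the sorting permutations as fixed when taking the gradient. The function names, step size, and iteration count are hypothetical choices made for this example.

```python
import numpy as np

def projected_w2_squared(X, Y, theta):
    """Squared 1-D Wasserstein-2 distance between the projections of X and Y onto theta."""
    diff = np.sort(X @ theta) - np.sort(Y @ theta)
    return float(np.mean(diff ** 2))

def max_sw_projected_gradient_ascent(X, Y, n_steps=100, lr=0.1, seed=0):
    """Search for a unit direction theta that maximizes the projected distance.

    X, Y: arrays of shape (n, d) with the same n (assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=X.shape[1])
    theta /= np.linalg.norm(theta)  # start on the unit sphere
    for _ in range(n_steps):
        # Sorting the projections gives the optimal 1-D coupling;
        # it is held fixed while differentiating with respect to theta.
        D = X[np.argsort(X @ theta)] - Y[np.argsort(Y @ theta)]
        grad = 2.0 * (D * (D @ theta)[:, None]).mean(axis=0)
        theta = theta + lr * grad       # gradient ascent step
        theta /= np.linalg.norm(theta)  # projection back onto the unit sphere
    return theta, projected_w2_squared(X, Y, theta)
```

For PRW with k > 1, the vector theta would become a d × k matrix with orthonormal columns, and the normalization step would be replaced by a projection onto that set (for example via a QR decomposition); that variant is not shown here.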
Amortized Projection Optimization for Sliced Wasserstein Generative Models
Seeking informative projecting directions has been an important task in utilizing sliced Wasserstein distance in applications. However, finding these directions usually requires an iterative optimization procedure over the space of projecting directions, which is computationally expensive. Moreover, the computational issue is even more severe in deep learning applications, where computing the distance between two mini-batch probability measures is repeated several times. This nested loop has been one of the main challenges that prevent the usage of sliced Wasserstein distances based on good projections in practice. To address this challenge, we propose to utilize the \textit{learning-to-optimize} technique or \textit{amortized optimization} to predict the informative direction for any two given mini-batch probability measures. To the best of our knowledge, this is the first work that bridges amortized optimization and sliced Wasserstein generative models. In particular, we derive linear amortized models, generalized linear amortized models, and non-linear amortized models, which correspond to three types of novel mini-batch losses, named \emph{amortized sliced Wasserstein}. We demonstrate the favorable performance of the proposed sliced losses in deep generative modeling on standard benchmark datasets.
Sliced Wasserstein Generative Models
Wu, Jiqing, Huang, Zhiwu, Li, Wen, Thoma, Janine, Van Gool, Luc
In this paper, we introduce a model of sliced optimal transport (SOT), which measures the distribution affinity with sliced Wasserstein distance (SWD). Since SWD enjoys the property of factorizing high-dimensional joint distributions into their multiple one-dimensional marginal distributions, its dual and primal forms can be approximated more easily than those of the Wasserstein distance (WD). Thus, we propose two types of differentiable SOT blocks to equip modern generative frameworks, namely Auto-Encoders (AEs) and Generative Adversarial Networks (GANs), with the primal and dual forms of SWD. The superiority of our SWAE and SWGAN over state-of-the-art generative models is studied both qualitatively and quantitatively on standard benchmarks.
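To make the one-dimensional factorization concrete, here is a minimal NumPy sketch of a Monte Carlo estimate of the primal SWD between two equal-size sample sets. It is not the SOT block of SWAE or SWGAN; the function name, the number of projections, and the choice of the 2-Wasserstein cost are assumptions made for this example.

```python
import numpy as np

def sliced_wasserstein_2(X, Y, n_projections=128, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-2 distance.

    X, Y: arrays of shape (n, d) with the same n (assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    # Draw random directions uniformly on the unit sphere.
    thetas = rng.normal(size=(n_projections, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    # Each column holds one 1-D marginal; sorting solves 1-D optimal transport.
    Xp = np.sort(X @ thetas.T, axis=0)
    Yp = np.sort(Y @ thetas.T, axis=0)
    # Average the squared 1-D costs over samples and directions.
    return float(np.sqrt(np.mean((Xp - Yp) ** 2)))
```

Each projection reduces the problem to sorting two lists of scalars, which is why the sliced distance is cheaper to approximate than the full Wasserstein distance in high dimensions.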